Global Convergence of Policy Gradient Primal–Dual Methods for Risk-Constrained LQRs

Authors

Abstract

While the techniques in optimal control theory are often model-based, the policy optimization (PO) approach directly optimizes the performance metric of interest. Even though it has been an essential approach for reinforcement learning problems, there is little theoretical understanding of its performance. In this article, we focus on the risk-constrained linear quadratic regulator problem via the PO approach, which requires addressing a challenging nonconvex constrained optimization problem. To solve it, we first build on our earlier result that an optimal policy has a time-invariant affine structure to show that the associated Lagrangian function is coercive, locally gradient dominated, and has a local Lipschitz continuous gradient, based on which we establish strong duality. Then, we design policy gradient primal–dual methods with global convergence guarantees in both model-based and sample-based settings. Finally, we use samples of system trajectories in simulations to validate our methods.
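As a rough illustration of what a primal–dual gradient iteration looks like, the sketch below applies one to a toy scalar problem: minimize (x - 2)^2 subject to x - 1 <= 0. This is not the article's algorithm or its risk-constrained LQR setting. The problem, the function name `primal_dual`, and the step sizes are all assumptions chosen for illustration.

```python
def primal_dual(step_x=0.05, step_lam=0.05, iters=5000):
    """Toy primal-dual gradient iteration (illustrative assumption only).

    Problem:    minimize (x - 2)^2  subject to  x - 1 <= 0
    Lagrangian: L(x, lam) = (x - 2)^2 + lam * (x - 1),  lam >= 0
    """
    x, lam = 0.0, 0.0
    for _ in range(iters):
        # Primal step: gradient descent on L with respect to x.
        grad_x = 2.0 * (x - 2.0) + lam
        x -= step_x * grad_x
        # Dual step: gradient ascent on L with respect to lam,
        # projected onto lam >= 0.
        lam = max(0.0, lam + step_lam * (x - 1.0))
    return x, lam


if __name__ == "__main__":
    x_opt, lam_opt = primal_dual()
    print(f"x = {x_opt:.6f}, lambda = {lam_opt:.6f}")
```

For this toy problem the constraint is active at the optimum, so the iterates approach x = 1 with multiplier lambda = 2, which is the saddle point of the Lagrangian.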

Similar resources

Global Convergence of Policy Gradient Methods for Linearized Control Problems

Direct policy gradient methods for reinforcement learning and continuous control problems are a popular approach for a variety of reasons: 1) they are easy to implement without explicit knowledge of the underlying model 2) they are an “end-to-end” approach, directly optimizing the performance metric of interest 3) they inherently allow for richly parameterized policies. A notable drawback is th...

Global Convergence Properties of Conjugate Gradient Methods for Optimization

This paper explores the convergence of nonlinear conjugate gradient methods without restarts, and with practical line searches. The analysis covers two classes of methods that are globally convergent on smooth, nonconvex functions. Some properties of the Fletcher-Reeves method play an important role in the first family, whereas the second family shares an important property with the Polak-Ribir...

Conditional Copula-GARCH Methods for Value at Risk of Portfolio: The Case of Tehran Stock Exchange Market

Value at risk is one of the most important measures of risk in economic enterprises. Estimating value at risk accurately is a very important matter, and deviation from it can lead to bankruptcy or to a suboptimal allocation of a firm's resources. The main aim of this study is to examine the performance of the conditional copula-GARCH method in estimating the value at risk of a portfolio consisting of two stocks, and the resulting value at risk is compared with traditional methods of estimating value at...

Global Convergence of Conjugate Gradient Methods without Line Search

Global convergence results are derived for well-known conjugate gradient methods in which the line search step is replaced by a step whose length is determined by a formula. The results include the following cases: 1. The Fletcher-Reeves method, the Hestenes-Stiefel method, and the Dai-Yuan method applied to a strongly convex LC objective function; 2. The Polak-Ribière method and the Conjugate ...

Gradient Convergence in Gradient Methods

For the classical gradient method $x_{t+1} = x_t - \gamma_t \nabla f(x_t)$ and several deterministic and stochastic variants, we discuss the issue of convergence of the gradient sequence $\nabla f(x_t)$ and the attendant issue of stationarity of limit points of $x_t$. We assume that $\nabla f$ is Lipschitz continuous, and that the stepsize $\gamma_t$ diminishes to 0 and satisfies standard stochastic approximation conditions. We show that eit...

Journal

Journal title: IEEE Transactions on Automatic Control

Year: 2023

ISSN: 0018-9286, 1558-2523, 2334-3303

DOI: https://doi.org/10.1109/tac.2023.3234176